Bottlenose dolphins (Tursiops truncatus) use short, wideband pulses for echolocation. Individual
waveforms  have  high-range  resolution  capability  but  are  relatively  insensitive  to  range  rate. 
Signal-to-noise  ratio  (SNR)  is  not  greatly  improved  by  pulse  compression  because  each  waveform 
has  small  time-bandwidth  product.  The  dolphin,  however,  often  uses  many  pulses  to  interrogate  a 
target,  and  could  use  multipulse  processing  to  combine  the  resulting  echoes.  Multipulse  processing 
could  mitigate  the  small  SNR  improvement  from  pulse  compression,  and  could  greatly  improve 
range-rate  estimation,  moving  target  indication,  range  tracking,  and  acoustic  imaging.  All  these 
hypothetical  capabilities  depend  upon  the  animal’s  ability  to  combine  multiple  echoes  for  detection 
and/or  estimation.  An  experiment  to  test  multiecho  processing  in  a  dolphin  measured  detection  of  a 
stationary  target  when  the  number  N  of  available  target  echoes  was  increased,  using  synthetic 
echoes.  The  SNR  required  for  detection  decreased  as  the  number  of  available  echoes  increased,  as 
expected  for  multiecho  processing.  A  receiver  that  sums  binary-quantized  data  samples  from 
multiple  echoes  closely  models  the  N  dependence  of  the  SNR  required  by  the  dolphin.  Such  a 
receiver  has  distribution-tolerant  (nonparametric)  properties  that  make  it  robust  in  environments 
with  nonstationary  and/or  non-Gaussian  noise,  such  as  the  pulses  created  by  snapping 
shrimp. © 2003 Acoustical Society of America. [DOI: 10.1121/1.1590969]

PACS  numbers:  43.80.Lb,  43.66.Gf  [WA] 


I.  INTRODUCTION 

Active echolocation allows bottlenose dolphins (Tursiops
truncatus) to investigate their surroundings using hearing
(see Au, 1993 for review). Multiple broadband, short-duration
acoustic “clicks” are emitted by the dolphin. Interaction
of the emitted signals with an object causes echoes to
return  to  the  animal.  Echo  characteristics  are  influenced  by 
the  location,  orientation,  and  physical  attributes  of  the  object. 
By  listening  to  these  returning  echoes,  dolphins  are  able  to 
locate  and  identify  elements  in  their  environment  that  might 
be  difficult  to  detect  visually. 

Because an echo is potentially generated for every click
that impinges on an object, the amount of information available
to the dolphin increases as more click-echo pairs are
produced. Much research has focused on the information
contained in the click-echo pair and how it is used by the
dolphin (Au, 1993; Au et al., 1988; Busnel and Fish, 1980;
Helweg et al., 1996; Nachtigall and Moore, 1988; Thomas
and Kastelein, 1990). The manner in which multiple echoes
clarify or add information, and how the dolphin utilizes this
information, is less clear (Dankiewicz et al., 2002; Moore
et al., 1991; Roitblat et al., 1991). Dependence of detection
performance upon the number of available echoes has been
demonstrated in the big brown bat (Surlykke, 1998).

Several theories exist regarding object detectability as a
function of the number of observations available to a receiver.
Dating back to the 1950s, several authors have investigated
detection of multiple acoustic signals in noise.
Green and Swets (1988) proposed two theories to account for
the influence of multiple observations on signal detection
performance. The first, termed the observation-integration
model, assumes that the subject is able to retain information
from successive presentations over a certain time period.
Detectability is improved as long as the subject is able to
successively integrate information from each stimulus presentation.
The second model is based on threshold theory, and is
comparable to the “multiple looks” model of temporal integration
(Viemeister and Wakefield, 1991). In this model,
each stimulus presentation can independently excite the sensory
system. Given that the subject’s momentary threshold
varies with time, the likelihood of the stimulus exceeding the
momentary threshold increases with the number of stimulus
presentations.

Data obtained by Swets et al. (1964) and Swets and
Green (1964) lend support to the integration model, and
show that performance generally increases proportionally to
the square root of the number of stimulus presentations. In a
study examining the effect of multiple observations on sensory
thresholds, Schafer and Shewmaker (1953) also found
that thresholds decreased in proportion to the square root of
the number of presentations. The integration model implies
that the detectability index of a set of N presentations equals
the square root of the sum-of-squares of the detectability
indices for the individual presentations (Green and Swets,
1988). If the detectability indices for the individual presentations
are identical, then the detectability index of a set of N
presentations equals $\sqrt{N}$ times the detectability index of a
single presentation. The $\sqrt{N}$ dependence follows from the
definition of the detectability index, as given in the Appendix.
Although many different integration models are possible
(e.g., linear summation, energy detection, and binary summation),
all such models have detectability indices that vary as
$\sqrt{N}$.
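The root-sum-of-squares rule described above can be illustrated with a minimal sketch. The function name `combined_detectability` and the single-presentation index of 1.0 are illustrative assumptions, not values from the study.

```python
import math

def combined_detectability(indices):
    """Root-sum-of-squares combination of per-presentation detectability indices."""
    return math.sqrt(sum(d * d for d in indices))

# Hypothetical single-presentation detectability index (not a measured value).
d_single = 1.0

for n in (1, 2, 4, 8, 16):
    d_n = combined_detectability([d_single] * n)
    # With identical indices the combined value equals sqrt(N) times d_single.
    print(n, round(d_n, 4), round(math.sqrt(n) * d_single, 4))
```

When the individual indices are equal, the two printed columns agree, which is the $\sqrt{N}$ dependence stated in the text.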

Multiecho combining is relevant to many sonar capabilities,
e.g., range-rate estimation and moving target indication
(MTI) with short-duration, Tursiops-like waveforms, target
tracking, and acoustic imaging in two or three dimensions. A
logical step to investigate such capabilities in dolphins is to
perform a critical experiment that ascertains whether the dolphin
is capable of the simplest echo-combining task, which
is to use multiple echoes from a stationary target to improve
detection performance. If an accurate receiver model can be
found, i.e., a model that accurately describes the dolphin’s
N-echo stationary-target detection performance, then this
model may be applicable to more sophisticated dolphin
echo-combining operations.

The current study was thus designed (i) to test the hypothesis
that dolphins combine echoes to improve signal detectability;
and (ii) to find the best receiver model to describe
the dolphin’s performance. A dolphin was trained to report
detection of synthetic echoes generated by computer in response
to the dolphin’s clicks, placing the number of echoes
available to the dolphin under experimental control. The dolphin’s
signal-detection performance was assessed when
1, 2, 4, 8, and 16 echoes were made available. Although the
available number of echoes (N) was preset, the dolphin’s
click emission rate was not controlled. The number of emitted
dolphin clicks thus could be much larger than the number
N of available echoes. During a test session, half of the trials
contained synthetic echoes in noise and half contained noise
only. Echo amplitudes were systematically decreased until
detection fell to chance. At least two such thresholds were
taken at each N level.

The results are summarized by plotting the signal-to-noise
ratio (SNR) required for detection as a function of the
number of available synthetic echoes that could be used by
the dolphin. This experimental function is compared to the
theoretical detection performance of three receiver models
operating in additive, white, zero-mean Gaussian noise. All
the receiver models initially are assumed to operate on one
time sample (or range sample) from each available simulated
echo, yielding N time samples altogether, where N is the
number of available simulated echoes. The three models are
linear summation, square-law summation (energy detection),
and summation of binary-quantized sample values (binary
M-out-of-N detection).

FIG. 1. Enlargement of GO stimulus pulse and corresponding spectrum.
When more than one pulse was allowed per trial, pulse separation was
constrained to be no smaller than 8 ms.


II.  METHODS 
A.  Subject 

The subject was a 17-year-old female Atlantic bottlenose
dolphin (Tursiops truncatus, “CAS”). Floating pen enclosures
on San Diego Bay, Space and Naval Warfare Systems
Center were utilized for housing and experimental sessions.
The subject resided with a small dolphin group but was separated
from them when sessions were conducted. CAS’ hearing
was measured previously and shown to be normal (Brill
et al., 2001).


B.  Synthetic  echoes  and  noise 

Conditions for behavioral responding were contingent
upon two types of computer-generated stimuli. The NO-GO
stimulus consisted of Gaussian noise with flat power spectrum
over the echolocation bandwidth of the dolphin (95 dB
SPL re: 1 μPa²/Hz between 10 and 150 kHz). This white
noise was present for the 4-s trial duration. The ambient
noise in San Diego Bay had a power spectrum level decreasing
from approximately 80 dB re: 1 μPa²/Hz at 10 kHz to
approximately 60 dB re: 1 μPa²/Hz at 100 kHz, measured
with one-octave spectrum analysis filters. The ambient noise
level has increased with time and is thus larger than the level
reported in Au (1993). The directivity of the dolphin’s receiver
(Au, 1993) further reduced the effective ambient noise
level relative to the NO-GO stimulus. At 50 kHz, the ambient
noise level was approximately 70 dB re: 1 μPa²/Hz.
This ambient noise level was 25 dB below the NO-GO
stimulus level, and was 38 dB below the NO-GO stimulus
level when the dolphin’s directivity index is considered.

The GO stimulus included 1, 2, 4, 8, 16, 32, or 64 pulses
embedded in the white noise. The 32- and 64-pulse conditions
were utilized during training phase sessions only. The
number of pulses for GO stimulus trials did not vary within
a session. Each pulse was a triangle-windowed 50-kHz
80-μs sinusoid (Fig. 1) delivered in response to the dolphin’s
outgoing echolocation click. An 8-ms click-pulse delay was
inserted to simulate a 6-m range. The total range of the artificial
echoes was 7 m, counting propagation time between the
transducers and the dolphin. Although the dolphin’s click
emission rate was not experimentally controlled, it was influenced
by this imposed range parameter (Penner, 1988).
Pulse source level was manipulated to determine CAS’ detection
thresholds above the noise floor. No attempt was
made to associate or equate the artificial stimuli with echoes
encountered under nonexperimental conditions.

C.  Apparatus 

Synthetic echoes were generated and delivered by an
electronic synthetic echo system (SES). One electronic echo
was delivered for every click emitted by the subject up to the
maximum N allowed for that session per trial. The dolphin’s
clicks were detected by a Reson TC4013 hydrophone located
0.64 m directly in front of her melon and triggered a single
electronic echo if the received level exceeded 170 dB re: 1
μPa. Clicks were bandpass filtered (3-300 kHz) and amplified
by 54 dB before reaching a multifunction board (National
Instruments PCI MIO-16E-1; on Pentium PC) where
triggering of synthetic echoes previously stored to RAM occurred.
Upon receiving a trigger, the SES converted a digital
waveform to an analog signal that was then filtered (10-200
kHz) and amplified (20 dB) by a DL Electronics 4302 filter/amplifier.
Analog echoes were added to the white noise using
custom hardware and projected to the subject by a second
TC4013 hydrophone located 0.7 m beyond the trigger hydrophone.
The echo stimulus thus emanated from a transducer
that was 1.34 m from the dolphin’s melon, located on a horizontal
line directly in front of the melon. A 7-m echo range
was simulated by insertion of an 8-ms delay between the
trigger event and output of an echo (12-m electronic delay
plus 2-m propagation delay, divided by 2). System calibration
included SPL measurements of TC4013 electronic echo
projection by an ITC 6030 omnidirectional hydrophone located
at the subject’s test station position. Surface reflections
were absorbed and dispersed by a cluster of nylon-bristle
brushes placed at the water surface midway between the dolphin
and the transducers.
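The simulated range quoted above is half of a total two-way path. A minimal sketch of that arithmetic follows, assuming a nominal sound speed of 1500 m/s (the value implied by equating an 8-ms delay with 12 m, but not stated explicitly in the text). The function name is illustrative.

```python
# Assumed nominal sound speed in seawater; implied by "8 ms <-> 12 m".
SOUND_SPEED_M_PER_S = 1500.0

def simulated_range_m(electronic_delay_s, propagation_path_m, c=SOUND_SPEED_M_PER_S):
    """Halve the total two-way path: electronic delay (as distance) plus physical path."""
    return (c * electronic_delay_s + propagation_path_m) / 2.0

# 8-ms electronic delay plus roughly 2 m of physical propagation
# (0.64 m out to the trigger hydrophone, 1.34 m back from the projector).
print(f"{simulated_range_m(0.008, 2.0):.2f} m")
print(f"{simulated_range_m(0.008, 0.0):.2f} m  (electronic delay alone)")
```

The second line corresponds to the 6-m range attributed to the 8-ms delay by itself in Sec. II B.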

D.  Session  procedure 

CAS  was  positioned  at  an  intertrial  station  in  front  of  an 
experimenter  before  a  trial.  At  the  start  of  each  trial,  CAS 
was  cued  to  submerge  into  a  test  station  hoop  1.35  m  below 
the  water  surface  by  the  experimenter’s  hand  gesture.  An 
acoustically  opaque  screen  (a  PVC  sheet  covered  with 
closed-cell  foam  neoprene)  placed  in  front  of  the  hoop  was 
removed  and  the  SES  simultaneously  activated,  initiating 
white  noise  and  permitting  the  dolphin  to  begin  echolocating. 
The  4-s  white-noise  burst  defined  the  trial  duration  for  the 
dolphin.  To  report  a  signal-present  condition  (GO  response), 
the  dolphin  immediately  moved  to  a  nearby  paddle  and 
touched  it  with  her  rostrum.  To  indicate  the  absence  of  a 
signal  (NO-GO  response),  she  remained  stationary  in  the 
hoop for the trial duration (4 s). If CAS did not begin movement
toward the paddle before the end of the 4-s window, her
response  was  classified  as  NO-GO.  CAS  typically  initiated  a 
GO  response  within  1-2  s.  Tone  and  fish  rewards  were  given 
for  every  correct  response.  An  equal  number  of  GO  and 
NO-GO  trials  was  presented  in  a  randomized  Gellermann 
series  (Gellermann,  1933).  The  likelihood  of  a  GO  following 
a NO-GO (or the reverse) followed a 0.5 first-order conditional
probability for every ten-trial block. The dolphin’s motivation
to perform reliably was assessed by ten warm-up
trials  before  every  session,  with  an  80%-correct  response  rate 
required  in  order  for  a  test  session  to  ensue.  No  more  than 
one  experimental  session  was  conducted  in  a  day. 

E.  Threshold  titration 

Thresholds were estimated for both training and testing
phases by using a signal amplitude titration method (up/down
staircase) that was contingent upon the dolphin’s responses
to GO stimulus trials (Moore and Schusterman,
1987). During the sessions, the experimenter manipulated
SPL by changing the voltage value of the synthetic echo
amplitude. Initially, GO signal amplitude was held constant
and easily discernible for the first ten trials of the session.
After the first ten trials, 0.2-V decrements in signal amplitude
were made until the dolphin responded incorrectly. Amplitude
was raised in 0.1-V increments until the dolphin detected
the signal again. All subsequent amplitude
adjustments were in 0.1-V steps, with decrements made after
every correct GO response and increases after every incorrect
response. A change in direction of amplitude adjustment
constituted a reversal, and a threshold estimate was taken
after ten reversals were acquired by calculating the mean
decibel level at those reversal points (50%-correct detection
rate). As CAS became experienced with the task, and echo
amplitudes were close to the white-noise floor, the titration
deltas were changed to 0.05 V. Logarithmic steps (constant
$\Delta v/v$) are more compatible with an animal’s sensitivity to
differences than constant $\Delta v$ steps, but constant steps are
approximately proportional to logarithmic steps when the
steps are small relative to the threshold level ($\Delta v \ll v$).
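The core of the titration rule (step down after a correct GO response, step up after a miss, average the decibel levels at the reversals) can be sketched as follows. This is a simplified illustration with a single step size and no initial fixed-amplitude trials. The noiseless observer with a 0.42-V cutoff, the dB reference of 1 V, and the `max_trials` guard are all assumptions made for the example, not part of the reported procedure.

```python
import math

def staircase_threshold_db(detects, start_v, step_v, n_reversals=10, max_trials=1000):
    """Mean level (dB re 1 V) at the reversal points of a one-down/one-up staircase.

    detects: callable mapping an echo amplitude in volts to True (correct GO
    response) or False (miss). It stands in for the animal and is hypothetical.
    """
    level = start_v
    direction = None            # -1 while descending, +1 while ascending
    reversal_levels = []
    for _ in range(max_trials):
        if len(reversal_levels) == n_reversals:
            break
        new_direction = -1 if detects(level) else +1
        if direction is not None and new_direction != direction:
            reversal_levels.append(level)   # adjustment direction just changed
        direction = new_direction
        level = max(level + new_direction * step_v, step_v)  # keep amplitude positive
    if len(reversal_levels) < n_reversals:
        raise RuntimeError("staircase did not produce enough reversals")
    return sum(20.0 * math.log10(v) for v in reversal_levels) / len(reversal_levels)

# Hypothetical noiseless observer that detects any echo of at least 0.42 V.
threshold_db = staircase_threshold_db(lambda v: v >= 0.42, start_v=1.0, step_v=0.1)
print(f"estimated threshold: {threshold_db:.2f} dB re 1 V")
```

With this idealized observer the staircase settles into alternating between the two amplitudes that bracket the cutoff, so the estimate is the mean of their decibel levels. A real observer responds probabilistically near threshold, which is why the procedure averages over ten reversals.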

F.  Animal  training 

Training the stimulus-response contingency was accomplished
by imposing minimal restrictions on the GO stimulus
variables in an effort to highlight differences from the
NO-GO stimulus. A generous number of synthetic echoes
were provided (N = 256) and signal amplitude was held approximately
40 dB above the noise floor (1.0 V). Stimuli
were presented in three to six same-trial blocks, e.g., four
NO-GO trials followed by six GO trials. Approximately four
sessions were required before the appropriate responding
was observed. Trial type was then randomized as described
previously. Once responding was stable, CAS was introduced
to a reduction in N. One to eight sessions (s) were
conducted at successively lower N levels as follows: N
= 64 (s = 8); N = 32 (s = 3); N = 16 (s = 5); N = 8 (s = 1);
N = 4 (s = 5); N = 2 (s = 2); and N = 1 (s = 3). Thresholds
were estimated during the final three sessions at N = 64, 16,
4, and 1, and during the final session only at N = 32, 8, and 2.
Once CAS demonstrated stable performance at the minimum
N level (N = 1), as indicated by the threshold session results
at N = 1, no further training was undertaken.

G.  Testing 

Exposure to all experimental conditions was completed
in the training phase so that testing-phase thresholds were
free of novelty effects. Two (N = 1, 4, 8, 16) or three (N = 2)
final thresholds were obtained in which signal detection performance
as a function of N was assessed. N was held constant
during a session while signal amplitude was titrated as
described previously.

FIG. 2. Average detection threshold in decibels obtained during training
sessions when the number of available echoes per trial (N) was 1, 4, 16, 64
(number of measured thresholds = 3) or 2, 8, 32 (number of measured
thresholds = 1). Error bars for the trials with three measured thresholds
represent SEM (standard error of the mean). [Figure title: N-Echoes Training Sessions.]

H.  Calculating  thresholds 

Recall that the experimenter manipulated SPL by changing
the voltage value of the synthetic echo amplitude. Synthesized
white, Gaussian noise was held constant at 95 dB
re: 1 μPa²/Hz. Thresholds were computed as signal-to-noise
ratios (SNR), the required echo amplitudes A(N) for
detection with N available echoes, divided by the rms noise
power. The bandwidth for rms noise power was estimated
using Q derived from critical band measures of the bottlenose
dolphin receiver. Q was approximately 2.2 for signals
with center frequency of 60 kHz (Au and Moore, 1990). The
synthetic signals used in this study had center frequency of
50 kHz; thus, noise bandwidth was estimated to be approximately
22.72 kHz. The calibrated system permitted conversion
of the voltages A(N) to dB, thereby allowing computation
of SNR in dB by subtracting rms noise power (dB) from
synthetic echo amplitude A(N) (dB). Importantly, note that
SNR was computed per echo, without weighting for the
number of available echoes N.
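The bandwidth and SNR arithmetic described above can be sketched as below. The sketch assumes that the in-band noise level is obtained by adding 10 log10(bandwidth) to the noise spectrum level, a standard conversion that the text does not spell out. The echo level of 145 dB is a hypothetical placeholder, not a measured threshold.

```python
import math

def noise_bandwidth_hz(center_hz, q):
    """Critical bandwidth estimated as center frequency divided by Q."""
    return center_hz / q

def band_noise_level_db(spectrum_level_db, bandwidth_hz):
    """In-band noise level for a flat spectrum: spectrum level + 10*log10(bandwidth)."""
    return spectrum_level_db + 10.0 * math.log10(bandwidth_hz)

def snr_db(echo_level_db, noise_level_db):
    """Per-echo SNR in dB: echo level minus in-band noise level."""
    return echo_level_db - noise_level_db

bandwidth = noise_bandwidth_hz(50e3, 2.2)          # 50-kHz pulse, Q of about 2.2
noise_db = band_noise_level_db(95.0, bandwidth)    # 95 dB re 1 uPa^2/Hz white noise
example_echo_db = 145.0                            # hypothetical echo level, dB re 1 uPa

print(f"bandwidth  = {bandwidth / 1e3:.2f} kHz")
print(f"band noise = {noise_db:.2f} dB re 1 uPa")
print(f"SNR        = {snr_db(example_echo_db, noise_db):.2f} dB")
```

Because SNR is computed per echo, the number of available echoes N does not enter this calculation.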

III.  RESULTS 
A.  Animal  training 

Figure 2 shows results of the initial detection threshold
sessions that were conducted at each N level. N is presented
on the horizontal axis, with sessions represented left-to-right
in the opposite order in which they were conducted. Detection
performance was strong for N values of 64, 32, and 16,
although sporadic threshold elevations were seen. It is likely
that these variations represent CAS’ growing familiarization
with the manipulation of N while the thresholds were being
titrated. At N values of 8 and 4, SNR required for detection
increased, and was highest when N was held at 2 and 1. The
mean false-alarm rate for the threshold sessions was 0.088
(s.d. = 0.069). Click emission was tracked for every trial and
results showed that CAS always emitted enough clicks to
receive the maximum number of echoes that were allowed
(mean clicks per trial = 80).

FIG. 3. Test session data: (A) Final detection thresholds in decibels when
the number of available echoes per trial (N) was 1, 4, 8, 16
(number of measured thresholds = 2) or 2 (number of measured thresholds
= 3). The two measured thresholds at N = 1 were equal and the dots at N
= 1 therefore overlap. (B) Average number of clicks emitted per trial at each
N level (pooled sessions). Error bars represent SEM. Significantly more
clicks were emitted at N = 1 compared to all other levels (Tukey-Kramer,
α = 0.05).

B.  Threshold  testing 

The top panel of Fig. 3 summarizes CAS’ detection
thresholds that were estimated during the test phase of the
experiment. N is presented on the horizontal axis, with successive
sessions represented in left-to-right order. The two
estimated threshold SNR values for N = 1 were identical, and
are thus represented by a single point in Fig. 3. Detection
thresholds were lower overall at each N level when compared
to the training session thresholds, perhaps due to increased
familiarization with the task. Mean false-alarm rate
was 0.034 (s.d. = 0.042). The thresholds are well behaved,
with SNR required for detection falling off monotonically as
the number of echoes (N) is increased. Recall that SNR is
measured per echo. The trend is consistent with those from
human listeners showing that detection improved as the
number of signals was increased (Green and Swets, 1988;
Swets and Green, 1964). These results support the inference
that CAS was able to combine multiple echoes in her biological
signal-processing system.

FIG. 4. Receiver models that are compared with dolphin N-echo detection data. (A) linear summation; (B) energy detection; (C) binary summation (M-out-of-N
detection). H1 is the signal-plus-noise hypothesis corresponding to the GO stimulus. H0 is the noise-only hypothesis corresponding to the NO-GO stimulus.

The lower panel of Fig. 3 summarizes the distribution of
clicks emitted by CAS that were above 170 dB during each
session. Analysis by a one-way ANOVA showed a difference
in click production as a function of N, F(4,524) = 5.4, p
< 0.0003. Comparison among the means using the Tukey-Kramer
test revealed that CAS emitted significantly more
clicks for the N = 1 condition than all the others (α = 0.05,
two-tail), supporting the notion that this condition was more
difficult than N = 2, 4, 8, and 16. Further evidence for difficulty
at N = 1 is that the change in threshold for the N = 1
condition compared to the N = 2 condition was 12 dB,
whereas the change in threshold between the other conditions
(N = 2 through N = 16) was almost a consistent 4-dB
change. The mean number of clicks emitted during all testing
sessions was 70, only a slight decrease from the average
number emitted during training sessions. CAS always emitted
more clicks than echoes that were available, thus ensuring
that she received all available synthetic echoes.

C.  Receiver  models 

Various models of animal echolocation have been employed
to understand the signal-processing operations that
may be used by the animals and to guide the design of broadband
sonar systems that attempt to emulate animal capabilities.
Receiver models that incorporate summation or integration
are relevant to this inquiry. Well-known integration
models pertain to summation over intervals in range/delay/time
(critical intervals; Vel’min and Dubrovskiy, 1976) and
over intervals in frequency (critical bands; Johnson, 1968).
The synthetic echo in Fig. 1 fits within a single critical interval
and a single critical band for a critical bandwidth of
22.72 kHz at a frequency of 50 kHz (Au and Moore, 1990).
This study addresses integration along a different dimension,
corresponding to the number of click-echo pairs (Floyd,
1980; Surlykke, 2003). A critical N value, corresponding to
the maximum number of echoes that can be integrated by the
dolphin, has yet to be determined. Figure 3 implies that the
critical N is greater than 16.

Three integration models are considered here in order to
better understand the SNR required by a dolphin for target
detection when the number of available echoes is varied.
These models correspond to linear summation, energy detection,
and summation of binary-quantized echo data (binary
M-out-of-N detection). The operations performed by the
three models are illustrated in Fig. 4.

The N-echo detection performance of the three receivers
is predicted by the analysis in the Appendix. For the linear
summation, energy detection, and binary M-out-of-N receivers,
the required echo amplitudes A(N) for detection with N
available echoes, divided by the rms noise power $\sigma$, are

$$[A(N)/\sigma]_{\mathrm{lin}} = c_l/\sqrt{N}, \tag{1}$$

$$[A(N)/\sigma]_{\mathrm{eng}} = (c_e/\sqrt{N})\,\{1 + [1 + (2N/c_e^2)]^{1/2}\}^{1/2}, \tag{2}$$

$$[A(N)/\sigma]_{\mathrm{bin}} = \operatorname{erfc}^{-1}(p_0) - \operatorname{erfc}^{-1}(p_1), \tag{3}$$

where $c_l$ and $c_e$ are constants, $\operatorname{erfc}^{-1}(\cdot)$ is the inverse
complementary error function (Fig. 5), $p_0$ is the probability
that the threshold level of the binary quantizer is exceeded
when only noise is present (the H0 hypothesis), and $p_1$ is the
probability that the binary quantizer threshold is exceeded
when both signal and noise are present (the H1 hypothesis).
The function $\operatorname{erfc}^{-1}(p)$ is related to the probit transformation
(Collett, 1952),

$$\operatorname{erfc}^{-1}(p) = \operatorname{probit}(1 - p). \tag{4}$$

In the binary summation model, probability $p_1$ depends upon $p_0$, N, and a constant $c_b$,

$$p_1 = \frac{(2p_0 + c_b/N) + \sqrt{(2p_0 + c_b/N)^2 - 4p_0(1 + c_b/N)\,[p_0(1 + c_b/N) - c_b/N]}}{2(1 + c_b/N)}. \tag{5}$$

For a prespecified value of $p_0$, all the $A(N)/\sigma$ expressions in
(1)-(3) depend on a constant ($c_l$, $c_e$, or $c_b$) and on the
number of available echoes, N.
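The three required-SNR expressions can be evaluated numerically with the sketch below. It reads Eq. (2) as (c_e/sqrt(N)) * sqrt(1 + sqrt(1 + 2N/c_e^2)) and takes the inverse complementary error function in the sense of Eq. (4), i.e., probit(1 - p), the inverse of the standard normal upper-tail probability. Function names are illustrative, and the parameter values are the best-fit values reported in Sec. III D.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance

def inv_upper_tail(p):
    """erfc^{-1}(p) in the sense of Eq. (4): probit(1 - p)."""
    return _STD_NORMAL.inv_cdf(1.0 - p)

def linear_snr(n, c_l):
    """Eq. (1): required A(N)/sigma for linear summation."""
    return c_l / math.sqrt(n)

def energy_snr(n, c_e):
    """Eq. (2): required A(N)/sigma for energy detection."""
    return (c_e / math.sqrt(n)) * math.sqrt(1.0 + math.sqrt(1.0 + 2.0 * n / c_e ** 2))

def binary_p1(n, p0, c_b):
    """Eq. (5): quantizer exceedance probability under H1."""
    k = c_b / n
    b = 2.0 * p0 + k
    disc = b * b - 4.0 * p0 * (1.0 + k) * (p0 * (1.0 + k) - k)
    return (b + math.sqrt(disc)) / (2.0 * (1.0 + k))

def binary_snr(n, p0, c_b):
    """Eq. (3): required A(N)/sigma for binary M-out-of-N detection."""
    return inv_upper_tail(p0) - inv_upper_tail(binary_p1(n, p0, c_b))

def to_db(amplitude_ratio):
    return 20.0 * math.log10(amplitude_ratio)

for n in (1, 2, 4, 8, 16):
    print(n,
          round(to_db(linear_snr(n, 4.58)), 2),
          round(to_db(energy_snr(n, 3.12)), 2),
          round(to_db(binary_snr(n, 0.5, 0.9999993)), 2))
```

With $p_0 = 0.5$ and $c_b$ exactly equal to 1, Eq. (5) gives $p_1 = 1$ at N = 1, for which Eq. (3) diverges. This is why a value of $c_b$ slightly below 1 is needed in that case.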


D.  Comparison  of  theoretical  performance  with 
dolphin  data 

To compare the receiver models with dolphin detection
data, the parameters $c_l$, $c_e$, $p_0$, and $c_b$ are adjusted to provide
a minimum mean-square error (MMSE) fit between the
values of $[A(N)/\sigma]_{\mathrm{model}}$ in (1)-(4) and the average experimental
value of $A(N)/\sigma$ for each N value. The correlation
coefficients between the average data points and their theoretical
counterparts are computed for each model. For visual
comparison, the data points and theoretical curves are plotted
together in Fig. 6 on a decibel scale, showing
$20\log_{10}[A(N)/\sigma]$ vs N. Figure 6 illustrates that the best fit
(by far) is obtained with the binary M-out-of-N receiver.

The best-fit parameters and data-model correlation coefficients
r are as follows:

Linear summation model: $c_l = 4.58$, $r_l = 0.9055$, (6)

Energy detection model: $c_e = 3.12$, $r_e = 0.9095$, (7)

Binary M-out-of-N detection: $p_0 = 0.5$, $c_b = 0.999\,999\,3$, $r_b = 0.9997$. (8)

The more accurate specification of $c_b$ is necessitated by receiver
operation on a steep part of the curve in Fig. 5 when
N = 1, as discussed in the Appendix.

FIG. 5. The inverse complementary error function $\operatorname{erfc}^{-1}(x)$.

The binary M-out-of-N detector seems to have an unfair
advantage because two parameters can be varied instead of
one, providing an extra degree of freedom for data fitting.
The extra degree of freedom is eliminated by choosing a
prior value for $p_0$. Choosing a fixed $p_0$ value is equivalent to
choosing a threshold for binary quantization. The most appropriate
prior choice for the binary quantization threshold is
zero, which implies that $p_0 = 0.5$ for all symmetric, zero-mean
noise distributions, independent of the noise power $\sigma^2$.
Variation of $p_0$ can be used to check the results, since the
best $p_0$ value for nonparametric operation is known to be
0.5.

FIG. 6. MMSE (minimum mean-square error) fits between three receiver
models and dolphin N-echo detection data. The MMSE algorithm compares
models to noise-normalized data amplitudes $A(N)/\sigma$, but the curves are
shown on a dB scale corresponding to SNR[dB] = $20\log_{10}[A(N)/\sigma]$.

IV.  DISCUSSION 

A.  Estimation  of  detectability  index 

The analysis in the Appendix indicates that the constants
$c_l$, $c_e$, and $c_b$ are related to the corresponding detectability
indices by the equations

$$c_l = d_l; \quad c_e = d_e; \quad c_b = d_b^2/2. \tag{9}$$

The detectability indices corresponding to the MMSE estimates
of $c_l$, $c_e$, and $c_b$ are

$$d_l = 4.58; \quad d_e = 3.12; \quad d_b = 1.41, \tag{10}$$

for the linear summation, energy detection, and binary
M-out-of-N receivers, respectively. These performance measures
are based on a restricted set of N values equal to 1, 2,
4, 8, and 16. The receiver model with the best fit to the data
has the worst performance in zero-mean Gaussian noise.

B.  Distribution  tolerance  of  the  binary  summation 
model 

The linear receiver is optimum for Gaussian noise with known variance, but the binary M-out-of-N processor has distribution-tolerant (nonparametric) properties. The false-alarm rate of the binary M-out-of-N receiver is insensitive to time-varying noise power and to the shape of any symmetric, zero-mean noise distribution. If the binary M-out-of-N receiver is a viable model for dolphin multiecho processing, then the dolphin has traded optimality in Gaussian noise with specified noise power for robustness with respect to the distribution and power of the noise. The performance disparity between binary and linear summation is not large if many echoes are used. Figure 6 implies that large N is associated with small SNR for all of the models. For large N and small SNR, it is shown in the Appendix that $d_l \approx 1.25\,d_b$.
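
The distribution tolerance described above can be checked numerically. The following Python sketch is an illustration under stated assumptions, not the analysis used in the study: the choice of N = 8 and M = 7, the four noise models, the trial count, and all function names are hypothetical. It compares the simulated false-alarm rate of a zero-threshold M-out-of-N detector with the binomial prediction for $p_0 = 0.5$.

```python
import random
from math import comb

def exact_false_alarm(n_echoes, m_required, p0=0.5):
    """Binomial tail: probability that at least M of N noise-only samples exceed the threshold."""
    return sum(comb(n_echoes, k) * p0**k * (1 - p0)**(n_echoes - k)
               for k in range(m_required, n_echoes + 1))

def simulated_false_alarm(draw_noise, n_echoes, m_required, trials, rng):
    """Fraction of noise-only trials in which at least M samples are positive."""
    alarms = 0
    for _ in range(trials):
        ones = sum(1 for _ in range(n_echoes) if draw_noise(rng) > 0.0)
        if ones >= m_required:
            alarms += 1
    return alarms / trials

# Hypothetical symmetric, zero-mean noise models (assumptions for illustration only).
NOISE_MODELS = {
    "gaussian, sigma=1": lambda rng: rng.gauss(0.0, 1.0),
    "gaussian, sigma=30": lambda rng: rng.gauss(0.0, 30.0),
    "gaussian, switching power": lambda rng: rng.gauss(0.0, rng.choice((0.1, 10.0))),
    "laplace": lambda rng: rng.choice((-1.0, 1.0)) * rng.expovariate(1.0),
}

N_ECHOES, M_REQUIRED, TRIALS = 8, 7, 20000
rng = random.Random(12345)
predicted = exact_false_alarm(N_ECHOES, M_REQUIRED)
results = {name: simulated_false_alarm(draw, N_ECHOES, M_REQUIRED, TRIALS, rng)
           for name, draw in NOISE_MODELS.items()}

print(f"binomial prediction: {predicted:.4f}")
for name, rate in results.items():
    print(f"{name:28s} simulated false-alarm rate: {rate:.4f}")
```

Under these assumptions, the simulated rates should agree with the binomial value to within sampling error for every noise model, because a zero threshold makes each binary sample a fair coin regardless of noise power or shape.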

C.  Preprocessing  with  an  auditory  transduction  model 

Figure 6 illustrates the performance of an M-out-of-N receiver with zero binary quantization threshold. This performance is unaffected by preprocessing input data with a sign-preserving zero-memory nonlinear transformation. Two examples of such a transformation are (1) a half-wave rectifier and (2) membrane potential as a function of the displacement of either inner or outer hair cells (Russell et al., 1986; Mountain and Hubbard, 1996). The binary summation model is insensitive to the nonlinear signal transformation that occurs during cochlear transduction from acoustic waveforms to neuronal excitations.

An envelope preprocessor is approximated by a weighted average of neighboring half-wave rectified data samples. A receiver model that uses envelope detection prior to binary quantization and M-out-of-N detection cannot be ruled out with current data.
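
The invariance to sign-preserving, zero-memory preprocessing can be illustrated with a short Python sketch. The `saturating_transducer` function below is a hypothetical stand-in for a hair-cell transfer function (a scaled `tanh`), not a model taken from the cited literature, and the sample values are synthetic.

```python
import math
import random

def binary_quantize(samples, threshold=0.0):
    """Map each sample to 1 if it exceeds the threshold, else 0."""
    return [1 if s > threshold else 0 for s in samples]

def half_wave_rectify(x):
    """Pass positive values unchanged and replace the rest with zero."""
    return x if x > 0.0 else 0.0

def saturating_transducer(x):
    """Hypothetical sign-preserving, saturating nonlinearity (assumption)."""
    return math.tanh(3.0 * x)

rng = random.Random(1)
# Synthetic samples: a small positive signal value plus unit-variance Gaussian noise.
echo_samples = [0.4 + rng.gauss(0.0, 1.0) for _ in range(16)]

raw_bits = binary_quantize(echo_samples)
rectified_bits = binary_quantize([half_wave_rectify(s) for s in echo_samples])
transduced_bits = binary_quantize([saturating_transducer(s) for s in echo_samples])

print(sum(raw_bits), sum(rectified_bits), sum(transduced_bits))
```

With a zero quantization threshold the three bit sequences are identical, so the count of ones used by the M-out-of-N decision does not change.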


D.  Phase  sensitivity 

Phase sensitivity of the dolphin N-echo receiver model is still an open question. If the data are linearly processed, half-wave rectified, or passed through a zero-memory hair cell model before binary quantization, then phase sensitivity depends upon maintaining an accurate sampling time relative to the time of signal transmission. Multiple, parallel M-out-of-N detectors can be used to test hypothesized sampling times, to compensate for sampling time errors, and to generate range-tracking information. If an envelope detector that forms a weighted sum of neighboring half-wave rectifier outputs is used as a preprocessor, then receiver tolerance to sampling time errors is increased but phase sensitivity is reduced.

To test phase sensitivity, the waveform in Fig. 1 can be replaced with a signal that has a short-duration, high-amplitude positive peak followed by (or surrounded by) long-duration, small-amplitude negative components with the same total area as the positive peak. Phase reversal of this waveform (multiplication by −1) should affect the detection performance of a phase-sensitive receiver.
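
A minimal Python sketch of such a test signal follows. The sample values, the noise level, and the function name are hypothetical, chosen only to make the effect of phase reversal visible for a receiver that samples at the peak time with a zero binary quantization threshold.

```python
from statistics import NormalDist

# Hypothetical discrete test waveform: one tall positive sample followed by
# eight small negative samples with the same total area (assumed values).
PEAK = 1.0
TAIL = [-0.125] * 8
waveform = [PEAK] + TAIL
reversed_waveform = [-s for s in waveform]

def threshold_crossing_probability(signal_sample, sigma, threshold=0.0):
    """P(signal + noise > threshold) for zero-mean Gaussian noise with rms value sigma."""
    return 1.0 - NormalDist(mu=signal_sample, sigma=sigma).cdf(threshold)

sigma = 0.5  # assumed rms noise level
p1_original = threshold_crossing_probability(waveform[0], sigma)
p1_reversed = threshold_crossing_probability(reversed_waveform[0], sigma)

print("total area:", sum(waveform))
print("P(one) at peak sample, original polarity:", round(p1_original, 4))
print("P(one) at peak sample, reversed polarity:", round(p1_reversed, 4))
```

In this sketch the probability of a "one" at the peak sample falls from well above 0.5 to well below 0.5 when the waveform is inverted, which is the kind of change a phase-sensitive binary receiver would register. An envelope-based preprocessor would be expected to reduce this difference.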


E.  Postdetection  integration 

The binary M-out-of-N receiver model is equivalent to a postprocessor for a detector that makes an H1 versus H0 decision at each range sample, for each click-echo pair. The binary decision variable at a given range is integrated or counted over successive click-echo pairs. A different strategy is followed by the linear and quadratic summation models, which do not implement a decision until all relevant data are summed. The latter strategy is generally regarded as superior because the level of each detector output is preserved and no information is lost via premature decision making. A large transient interference pulse (e.g., from a snapping shrimp), however, will have a much larger effect on linear or quadratic summation than upon summation of binary decision variables (Bullock, 1986). In San Diego Bay and many other locations, snapping shrimp are an important source of interference for dolphin echolocation (Au and Banks, 1998). Aside from interference considerations, the dynamic range tolerance provided by binary quantization may be important for detection, tracking, and acoustic imaging of prey with large aspect-dependent variation in target strength.
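
The different effect of a single large transient on the two summation strategies can be seen in a small Python sketch. The sample values and the size of the interference pulse are invented for illustration and are not measurements.

```python
def linear_statistic(samples):
    """Average of the raw samples (linear summation)."""
    return sum(samples) / len(samples)

def binary_count(samples, threshold=0.0):
    """Number of samples above the binary quantization threshold."""
    return sum(1 for s in samples if s > threshold)

# Hypothetical noise-only samples at one range bin over eight click-echo pairs.
quiet = [0.3, -0.8, 0.1, -0.4, -0.2, 0.6, -0.5, -0.1]

# Same data with one sample replaced by a large transient (assumed amplitude).
spiked = list(quiet)
spiked[3] = 500.0

print("linear:", linear_statistic(quiet), "->", linear_statistic(spiked))
print("binary count:", binary_count(quiet), "->", binary_count(spiked))
```

The linear statistic is dominated by the single pulse, whereas the binary count can change by at most one per contaminated echo.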


F.  Biological  neural  networks 

The binary M-out-of-N detector model is a basic building block for neural networks that process action potentials, which are binary, all-or-none signals. A neuronal version of a binary M-out-of-N processor could use binary sampling intervals corresponding to the width of an action potential spike. If the intensity of a stimulus is encoded by the density of action potential spikes and/or the duration of a spike sequence, the binary M-out-of-N processor can function as a stimulus intensity or amplitude decoder, despite the amplitude insensitivity associated with binary quantization.

G. Polarity coincidence correlation and binaural localization models

The binary M-out-of-N detector is a polarity coincidence correlator with a constant, unit reference function. The polarity coincidence correlator is well known for its relative insensitivity to the probability distribution of input data (Wolff et al., 1962). A neurophysiological model for binaural localization (interaural time delay estimation) uses coincidence of the excitations in two neural delay lines, one from each ear (Jeffress, 1948; Konishi, 1993; Colburn, 1996). This binaural model is similar to polarity coincidence correlation and thus to the binary M-out-of-N receiver model. In biological sonar systems, interaural polarity coincidence correlation can be used for azimuth estimation (and for elevation estimation if the animal rolls by 90 degrees). Range tracking can be implemented via cross correlation of successive echoes, using a polarity coincidence correlator.


H.  Binary  quantization  and  zero  crossings 

A binary waveform representation preserves information about real zero crossings, which are important signal attributes (Kedem, 1994; Marr, 1982; Requicha, 1980; Voelcker, 1966a, 1966b). A polarity coincidence correlator can use these attributes for detection, estimation, classification, and decomposition via Haar functions (Hagen and Farley, 1973; Vetterli and Kovacevic, 1995).


I.  Capabilities  of  a  multipulse  sonar  receiver  that 
uses  binary  summation 

The ability of a dolphin to combine information from multiple pulse-echo pairs is necessary for advanced signal-processing capabilities. One of these capabilities is acoustic imaging via a simplified version of synthetic aperture sonar (SAS) processing (Altes, 1995; Altes et al., 1998; Altes, 2003). SAS images can be formed by adding the echo sample at each range to an appropriate pixel in a two- or three-dimensional image. The image is sequentially constructed as multiple echoes are obtained from different aspects. High-resolution SAS images have been created from binary-quantized sonar echo envelopes (with a nonzero quantization threshold), using dolphin-like transmitted waveforms (Altes, personal observation). After N echoes are processed, each pixel level in such an image represents the response of a binary M-out-of-N receiver.

The dolphin may use multipulse processing to estimate range-rate, implement a moving target indicator (MTI), track targets, and perform acoustic imaging in two or three dimensions. Figure 6 implies that the dolphin can perform robust integration along a constant-range line in the range, echo-number (R, N) plane. Range-rate estimation, tracking, and acoustic imaging involve integration along other lines or curves in the R, N plane. The simplest moving target indicator computes the difference between successive detector outputs along a constant-range line in the R, N plane. A more sophisticated MTI uses a weighted sum of such outputs, with positive and negative weights.
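
The two MTI variants just described can be sketched in Python as follows. The detector output sequences and the weight set (1, −2, 1) are hypothetical examples, not values from the experiment.

```python
def simple_mti(outputs):
    """Difference between successive detector outputs at one range."""
    return [later - earlier for earlier, later in zip(outputs, outputs[1:])]

def weighted_mti(outputs, weights):
    """Sliding weighted sum of detector outputs (weights may be positive or negative)."""
    span = len(weights)
    return [sum(w * o for w, o in zip(weights, outputs[i:i + span]))
            for i in range(len(outputs) - span + 1)]

# Hypothetical binary detector outputs along a constant-range line, one per echo.
stationary_target = [1, 1, 1, 1, 1, 1]   # target stays in the range cell
arriving_target = [0, 0, 0, 1, 1, 1]     # target enters the range cell

print(simple_mti(stationary_target))
print(simple_mti(arriving_target))
print(weighted_mti(arriving_target, (1, -2, 1)))
```

A stationary target produces an all-zero MTI output, while a target that moves into the range cell produces a nonzero response at the echo where the detector output changes.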


J.  Adaptability  of  the  dolphin  receiver 

The power spectrum of artificially added Gaussian noise (white over the dolphin’s echolocation bandwidth) was 95 dB re: 1 μPa²/Hz. When an extra 13 dB is added to account for the dolphin’s directivity index, the artificial Gaussian noise was 108 dB above spatially uniform noise with a power spectral density of 1 μPa²/Hz, and 38 dB above the average ambient noise level of 70 dB re: 1 μPa²/Hz. The ambient level included average snapping shrimp interference and the sounds of other dolphins in the area. Despite the 38-dB difference, the analysis indicates that the dolphin’s receiver did not adapt to operation in Gaussian noise as opposed to strong transient interference. This apparent lack of adaptability can be explained by the large disparity between peak and average levels of transient sounds. At a distance of 1 meter, the power spectral density of a single snapping shrimp pulse is between 105 and 111 dB re: 1 μPa²/Hz (Au and Banks, 1998; Versluis et al., 2000). Spherical spreading decreases this level by 20 log r if the shrimp is r meters from the receiver. A spatially averaged interference power spectral density level of 108 dB re: 1 μPa²/Hz is required to make the interference power spectrum equal to the power spectrum of the artificially added Gaussian noise at the input to the dolphin’s receiver. The possible presence of nearby snapping shrimp thus could have constrained the dolphin’s receiver design. The experiment was performed in an area with floating walkways to support trainers and equipment. Snapping shrimp appear to congregate in such areas (Ferguson and Cleary, 2001).
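
The level arithmetic in this section can be reproduced with a few lines of Python. The constants are the values quoted in the text; the function name and the example ranges are illustrative assumptions, and the logarithm is taken as base 10.

```python
import math

ADDED_NOISE_DB = 95.0               # artificial Gaussian noise, dB re 1 uPa^2/Hz
DIRECTIVITY_INDEX_DB = 13.0         # dolphin directivity index quoted in the text
AMBIENT_DB = 70.0                   # average ambient noise level
SHRIMP_PSD_AT_1M_DB = (105.0, 111.0)  # single snapping shrimp pulse at 1 m

equivalent_uniform_db = ADDED_NOISE_DB + DIRECTIVITY_INDEX_DB
excess_over_ambient_db = equivalent_uniform_db - AMBIENT_DB

def level_at_range(level_at_1m_db, range_m):
    """Level after spherical spreading loss of 20*log10(r) from a 1 m reference."""
    return level_at_1m_db - 20.0 * math.log10(range_m)

print("equivalent spatially uniform level:", equivalent_uniform_db, "dB")
print("excess over ambient:", excess_over_ambient_db, "dB")
for r in (1.0, 2.0, 10.0):  # example ranges in meters (assumed)
    low, high = (level_at_range(level, r) for level in SHRIMP_PSD_AT_1M_DB)
    print(f"shrimp pulse at {r:g} m: {low:.1f} to {high:.1f} dB")
```

This only restates the spreading-loss rule; it does not model the spatial averaging over many shrimp that the text refers to.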


V.  CONCLUSION 

Synthetic echoes in additive noise were used to estimate the SNR required by a dolphin for detection when the number of available echoes was varied. A close fit to the dolphin’s performance data was obtained with a receiver model that sums binary-quantized time samples from N available echoes. This detector does not perform as well as linear summation for Gaussian noise with known average power, but its false-alarm rate is distribution tolerant and nonparametric with respect to variable noise power. The dolphin’s acoustic environment is, in fact, notoriously non-Gaussian and nonstationary (Urick, 1975). The close fit of the binary M-out-of-N model to the dolphin detection data, together with the relatively poor fit of the linear and energy summation models, implies that the dolphin trades optimality (in Gaussian noise with known power) for robustness.

Further experiments are needed to determine whether the dolphin’s time sampling is sufficiently precise to allow phase sensitivity in the context of the binary summation model. Even without phase sensitivity, binary summation can be used for acoustic imaging. Binary M-out-of-N detection is a special case of polarity coincidence correlation (a model for binaural localization), and it is similar to operations performed by biological neural networks. Another question is whether the dolphin’s receiver can adapt to optimum operation in Gaussian noise when snapping shrimp are not present.

APPENDIX: DETECTABILITY INDEX DERIVATIONS

The output mean values and variances from each receiver model in Fig. 4 can be used to predict the detectability index at the receiver output. The detectability index is closely related to the Fisher ratio and to the t-statistic. Receiver performance (detection, false alarm, and error probabilities) can be computed from the detectability index if the output distributions are Gaussian with equal variance under noise-only and signal-plus-noise conditions (Van Trees, 1968). This condition seldom applies to nonlinear receivers. Detection performance, however, almost always varies monotonically with the detectability index, which is defined as the difference between the output mean value when hypothesis H1 (signal plus noise) is true and the output mean value when H0 (noise alone) is true, divided by the square root of the average output variance for the two hypotheses.

Addition of N independent receiver outputs with fixed SNR causes the mean output and variance to be multiplied by N, for both H1 and H0. The detectability index for the sum is then the detectability index for a single observation multiplied by the square root of N. Equivalently, the detectability index of the sum is the square root of the sum-of-squares of the detectability indices for the individual observations as in Green and Swets (1988), regardless of the receiver model, for constant SNR.
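
The combination rule in the preceding paragraph can be written as a one-line function. The Python sketch below uses a hypothetical function name and example index values.

```python
import math

def combined_detectability(indices):
    """Root-sum-of-squares combination of independent detectability indices."""
    return math.sqrt(sum(d * d for d in indices))

single_echo_index = 1.5  # assumed value for illustration
for n_echoes in (1, 4, 16):
    total = combined_detectability([single_echo_index] * n_echoes)
    print(n_echoes, total, single_echo_index * math.sqrt(n_echoes))
```

For equal indices the root-sum-of-squares reduces to the single-observation index multiplied by the square root of N, as stated in the text.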

In the following analysis, the detectability index of each receiver model is assumed to vary monotonically with detection performance and to be constant for all values of N, the number of echoes available for detection. A constant detectability index for all N values implies that SNR decreases with N. These assumptions imply that each receiver model (as well as the dolphin) uses a consistent performance criterion (detection, false alarm, and error probability) for decision making at all N values.

Since the detectability index $d$ depends upon SNR and N, it should be possible to obtain an expression for SNR $= 20\log_{10}[A(N)/\sigma]$ as a function of $d$ and N. The noise-normalized echo amplitude required for detection, $A(N)/\sigma$, depends upon the detectability index $d$ and the number of available echoes N. For the binary summation model, $A(N)/\sigma$ also depends upon the threshold $y_b$ for binary quantization, or equivalently, on the probability $p_0$ of a threshold crossing when H0 is true. In general

$$[A(N)/\sigma]_{\text{theory}} = f_{\text{rcvr}}(d, N, y_b), \tag{A1}$$

where the function $f_{\text{rcvr}}(d, N, y_b)$ depends upon the receiver model.


The experimental results yield the average signal-to-noise ratio (SNR) in decibels at specific N values, and

$$[A(N)/\sigma]_{\text{expt}} = 10^{\text{SNR}/20}. \tag{A2}$$

For each receiver model, a gradient descent algorithm is used to find the value of $d$ (and $p_0$ or $y_b$ if one of these quantities is not prespecified) that minimizes the mean-square difference between $[A(N)/\sigma]_{\text{theory}}$ and $[A(N)/\sigma]_{\text{expt}}$ at the N values that were used in the experiment. The resemblance between a model and the experimental data is quantitatively represented by a correlation coefficient (Hays, 1994) computed from the $A(N)/\sigma$ values of the best-fit model and the data at the experimental N values.
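
The fitting procedure can be sketched in Python for a one-parameter model. This is a minimal illustration under stated assumptions and not the authors' code: the "data" below are synthetic placeholders (a $c/\sqrt{N}$ curve with small invented offsets), not the dolphin measurements, and the step size, iteration count, and function names are arbitrary choices.

```python
import math

ECHO_COUNTS = [1, 2, 4, 8, 16]

def model_amplitude(c, n):
    """One-parameter model of the form c / sqrt(N)."""
    return c / math.sqrt(n)

def fit_by_gradient_descent(data, counts, c_start=1.0, rate=1.0, steps=500):
    """Minimize the mean-square difference between model and data over c."""
    c = c_start
    for _ in range(steps):
        gradient = sum(2.0 * (model_amplitude(c, n) - y) / math.sqrt(n)
                       for n, y in zip(counts, data)) / len(counts)
        c -= rate * gradient
    return c

def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Synthetic placeholder data (NOT measurements): model curve plus invented offsets.
OFFSETS = [0.20, -0.10, 0.05, -0.05, 0.02]
synthetic_data = [model_amplitude(4.58, n) + e for n, e in zip(ECHO_COUNTS, OFFSETS)]

c_fit = fit_by_gradient_descent(synthetic_data, ECHO_COUNTS)
best_fit_curve = [model_amplitude(c_fit, n) for n in ECHO_COUNTS]
r = correlation(best_fit_curve, synthetic_data)
print("fitted constant:", c_fit)
print("correlation coefficient:", r)
```

For this particular model the minimum also has a closed form, $c = \sum_i (y_i/\sqrt{N_i}) / \sum_i (1/N_i)$, which provides a check on the iterative result. Models with more parameters, such as the binary model with free $p_0$, would need a gradient in each parameter.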

The linear summation model computes the function

$$h(\mathbf{x}) = (1/N)\sum_{i=1}^{N}(x_i + A), \tag{A3}$$

where $A$ is the sampled signal value and $\{x_i;\ i = 1,\ldots,N\}$ are independent identically distributed noise samples (one from each echo) with zero mean and variance $\sigma^2$; $E(x_i) = 0$ and $E(x_i x_j) = \sigma^2$ if $x_i = x_j$, and zero otherwise. $E(x)$ is the ensemble expected value of $x$, and $\mathbf{x}$ is the set of noise samples $x_1, x_2, \ldots, x_N$. It follows that

$$E[h(\mathbf{x})] = (1/N)\sum_{i=1}^{N}E(x_i + A) = A, \tag{A4}$$

and

$$\begin{aligned}
E[h^2(\mathbf{x})] &= (1/N^2)\sum_{i=1}^{N}\sum_{j=1}^{N}E[(x_i+A)(x_j+A)] \\
&= (1/N^2)\left\{\sum_{i=1}^{N}E[(x_i+A)^2] + \sum_{i=1}^{N}\sum_{j\neq i}E[(x_i+A)(x_j+A)]\right\} \\
&= (\sigma^2/N) + A^2.
\end{aligned} \tag{A5}$$

The variance of the averaged, linearly transformed data is then

$$\mathrm{Var}[h(\mathbf{x})] = E[h^2(\mathbf{x})] - E^2[h(\mathbf{x})] = \sigma^2/N. \tag{A6}$$

When hypothesis H1 is true, the data consist of signal plus noise with $A \neq 0$. If H0 is true, the data consist of noise alone ($A = 0$). The corresponding detectability index is

$$d_l = \frac{\left|E[h(\mathbf{x})|H1] - E[h(\mathbf{x})|H0]\right|}{\left[(1/2)\{\mathrm{Var}[h(\mathbf{x})|H0] + \mathrm{Var}[h(\mathbf{x})|H1]\}\right]^{1/2}} = \sqrt{N}A/\sigma. \tag{A7}$$

In the psychophysical literature, $d'$ is used instead of $d$. The prime is omitted here in order to simplify notation in the equations.

The linear summation model is evaluated by adjusting the constant $c_l = d_l$ in the equation

$$[A(N)/\sigma]_{\text{lin}} = c_l/\sqrt{N}, \tag{A8}$$

to obtain a minimum mean-square error (MMSE) fit of $[A(N)/\sigma]_{\text{lin}}$ to $[A(N)/\sigma]_{\text{expt}}$ for the experimental values of N. The resulting best fit is then evaluated via the correlation coefficient between $[A(N)/\sigma]_{\text{lin}}$ and $[A(N)/\sigma]_{\text{expt}}$.
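
As a worked illustration of (A8), the following Python sketch converts the linear-model amplitude ratio to decibels at the experimental N values. The constant 4.58 is the $d_l$ estimate quoted in Eq. (10); the function names are hypothetical.

```python
import math

C_LINEAR = 4.58  # MMSE estimate of c_l = d_l quoted in Eq. (10)

def linear_amplitude(n_echoes, c_l=C_LINEAR):
    """Noise-normalized amplitude required by the linear model, Eq. (A8)."""
    return c_l / math.sqrt(n_echoes)

def to_db(amplitude_ratio):
    """SNR in dB for a noise-normalized amplitude ratio."""
    return 20.0 * math.log10(amplitude_ratio)

for n in (1, 2, 4, 8, 16):
    print(n, round(to_db(linear_amplitude(n)), 2))
```

Because the amplitude ratio scales as $1/\sqrt{N}$, the required SNR for the linear model drops by about 3 dB for each doubling of N.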

For an average of squared data samples (energy detection)

$$h(\mathbf{x}) = (1/N)\sum_{i=1}^{N}(x_i + A)^2, \tag{A9}$$

$$E[h(\mathbf{x})] = (1/N)\sum_{i=1}^{N}E(x_i + A)^2 = \sigma^2 + A^2, \tag{A10}$$

and

$$\begin{aligned}
E[h^2(\mathbf{x})] &= (1/N^2)\sum_{i=1}^{N}\sum_{j=1}^{N}E[(x_i+A)^2(x_j+A)^2] \\
&= (1/N^2)\left\{\sum_{i=1}^{N}E[(x_i+A)^4] + \sum_{i=1}^{N}\sum_{j\neq i}E[(x_i+A)^2(x_j+A)^2]\right\},
\end{aligned} \tag{A11}$$

where

$$E[(x_i+A)^4] = E(x_i^4) + 4E(x_i^3)A + 6E(x_i^2)A^2 + 4E(x_i)A^3 + A^4. \tag{A12}$$

For a zero-mean Gaussian random variable, $E(x_i) = 0$, $E(x_i^2) = \sigma^2$, $E(x_i^3) = 0$, and $E(x_i^4) = 3\sigma^4$. It follows that

$$E[h^2(\mathbf{x})] = (1/N)(3\sigma^4 + 6\sigma^2A^2 + A^4) + [(N-1)/N](\sigma^2 + A^2)^2, \tag{A13}$$

and

$$\mathrm{Var}[h(\mathbf{x})] = E[h^2(\mathbf{x})] - E^2[h(\mathbf{x})] = (2\sigma^2/N)(\sigma^2 + 2A^2). \tag{A14}$$

The detectability index is then

$$\begin{aligned}
d_e &= \frac{\left|E[h(\mathbf{x})|H1] - E[h(\mathbf{x})|H0]\right|}{\left[(1/2)\{\mathrm{Var}[h(\mathbf{x})|H0] + \mathrm{Var}[h(\mathbf{x})|H1]\}\right]^{1/2}} \\
&= \sqrt{N/2}\,\frac{(A/\sigma)^2}{\sqrt{1 + (A/\sigma)^2}} \\
&\approx \sqrt{N/2}\,|A/\sigma| \quad \text{at high SNR } (|A/\sigma| \gg 1) \\
&\approx \sqrt{N/2}\,(A/\sigma)^2 \quad \text{at low SNR } (|A/\sigma| \ll 1).
\end{aligned} \tag{A15}$$

For a general square-law model with no assumptions about SNR, the first equation in (A15) can be written

$$(N/2)x^2 - d_e^2 x - d_e^2 = 0, \tag{A16}$$

where $x = (A/\sigma)^2$. Solving for $x$ yields

$$A(N)/\sigma = \left[(d_e^2/N) \pm (d_e/N)\sqrt{d_e^2 + 2N}\right]^{1/2}. \tag{A17}$$

To obtain a real-valued $A(N)/\sigma$, the plus-or-minus operation in (A17) must always be plus, and

$$[A(N)/\sigma]_{\text{egy}} = (c_e/\sqrt{N})\left[1 + \sqrt{1 + 2(c_e/\sqrt{N})^{-2}}\right]^{1/2}, \tag{A18}$$

where $c_e = d_e$. To evaluate the general square-law model, the constant $c_e$ in (A18) is adjusted to obtain an MMSE fit of $[A(N)/\sigma]_{\text{egy}}$ to $[A(N)/\sigma]_{\text{expt}}$ for the experimental values of N.

For the M-out-of-N receiver model, the binomial distribution (Papoulis, 1965) describes the probabilities of various numbers of ones and zeros at the output of the binary quantizer for N echoes. Let $p_1$ equal the probability that the binary random variable equals 1 (the sampled data value is greater than the binary quantization threshold, $y_b$) when an echo is present (the signal plus noise condition, H1). Let $p_0$ equal the probability that the binary random variable is 1 when the echo is absent (the noise alone condition, H0). For one binary sample from each of N echoes

The expected number of “ones” with echo present (H1) equals $Np_1$.

The expected number of “ones” with echo absent (H0) equals $Np_0$.

The variance of the distribution of the number of “ones” given H1 equals $Np_1(1 - p_1)$.

The variance of the distribution of the number of “ones” given H0 equals $Np_0(1 - p_0)$.

The detectability index is the difference in means divided by the square root of the average variance:

$$d_b = \frac{Np_1 - Np_0}{\sqrt{(1/2)[Np_1(1 - p_1) + Np_0(1 - p_0)]}}. \tag{A19}$$

As in the previous models described by (A8) and (A15), the detectability index is proportional to the square root of N if $p_1$ is constant (constant SNR).

If a threshold value $y_b$ is used to convert echo samples into binary data, the probability that the binary random variable equals 1 when the signal is absent (noise alone) is

$$p_0 = \operatorname{erfc}_*(y_b/\sigma), \tag{A20}$$

and the probability that the binary random variable equals 1 when the signal is present is

$$p_1 = \operatorname{erfc}_*\{[y_b - A(N)]/\sigma\}, \tag{A21}$$

where $\sigma$ is the rms noise power and the complementary error function $\operatorname{erfc}_*(x)$ is the integral of a zero-mean, unit variance normal distribution between $x$ and infinity.

From (A20), the threshold level is

$$y_b = \sigma\operatorname{erfc}_*^{-1}(p_0), \tag{A22}$$

where $\operatorname{erfc}_*^{-1}(p_0)$ is the inverse complementary error function of $p_0$ as in Fig. 5. Similarly, (A21) can be solved for $A(N)$:

$$A(N) = y_b - \sigma\operatorname{erfc}_*^{-1}(p_1). \tag{A23}$$

Substituting (A22) into (A23),

$$[A(N)/\sigma]_{\text{bin}} = \operatorname{erfc}_*^{-1}(p_0) - \operatorname{erfc}_*^{-1}(p_1). \tag{A24}$$

Letting $c_b = (1/2)d_b^2$ and solving (A19) for $p_1$ yields

$$p_1 = \frac{(2p_0 + c_b/N) \pm \sqrt{(2p_0 + c_b/N)^2 - 4p_0(1 + c_b/N)[p_0(1 + c_b/N) - c_b/N]}}{2(1 + c_b/N)}. \tag{A25}$$

The plus-or-minus operation in (A25) must always be plus in order for $p_1$ to be non-negative. For a given level of detection performance (e.g., a given percentage of correct decisions), the detectability index is constant and the parameters $c_b$ and $p_0$ in (A24)-(A25) can be adjusted to obtain an MMSE fit of $[A(N)/\sigma]_{\text{bin}}$ to $[A(N)/\sigma]_{\text{expt}}$. The resulting values of $[A(N)/\sigma]_{\text{bin}}$ yield a much better fit to the dolphin data than can be obtained via linear summation or energy detection models.
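
Equations (A18) and (A24)-(A25) can be evaluated numerically. The Python sketch below is an illustrative implementation under stated assumptions, not the authors' code: it uses the constants $c_e = 3.12$ (from $d_e$ in Eq. 10), $c_b = 0.9999993$, and $p_0 = 0.5$ quoted in the text, expresses $\operatorname{erfc}_*^{-1}$ through the standard library's inverse normal CDF, and includes round-trip checks against (A15) and (A19). All function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def inv_erfc_star(p):
    """Inverse of erfc_*(x) = P(Z > x) for a standard normal Z."""
    return _STD_NORMAL.inv_cdf(1.0 - p)

def energy_amplitude(n, c_e):
    """Noise-normalized amplitude for the energy model, Eq. (A18)."""
    r = c_e / math.sqrt(n)
    return r * math.sqrt(1.0 + math.sqrt(1.0 + 2.0 / (r * r)))

def energy_index(n, amplitude_ratio):
    """Detectability index of the energy model, first line of Eq. (A15)."""
    x = amplitude_ratio * amplitude_ratio
    return math.sqrt(n / 2.0) * x / math.sqrt(1.0 + x)

def binary_p1(n, c_b, p0=0.5):
    """Threshold-crossing probability under H1, Eq. (A25) with the plus sign."""
    k = c_b / n
    b = 2.0 * p0 + k
    disc = b * b - 4.0 * p0 * (1.0 + k) * (p0 * (1.0 + k) - k)
    return (b + math.sqrt(disc)) / (2.0 * (1.0 + k))

def binary_index(n, p1, p0=0.5):
    """Detectability index of the binary model, Eq. (A19)."""
    return n * (p1 - p0) / math.sqrt(0.5 * n * (p1 * (1 - p1) + p0 * (1 - p0)))

def binary_amplitude(n, c_b, p0=0.5):
    """Noise-normalized amplitude for the binary model, Eq. (A24)."""
    return inv_erfc_star(p0) - inv_erfc_star(binary_p1(n, c_b, p0))

C_ENERGY, C_BINARY = 3.12, 0.9999993  # constants quoted in the text
for n in (1, 2, 4, 8, 16):
    a_e = energy_amplitude(n, C_ENERGY)
    a_b = binary_amplitude(n, C_BINARY)
    print(n, round(20 * math.log10(a_e), 2), round(20 * math.log10(a_b), 2))
```

Note that `binary_amplitude` requires $p_1 < 1$, which holds here because $c_b$ is slightly less than 1. With $c_b = 1$ exactly and N = 1 the inverse function is unbounded and the call would fail.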

The binary M-out-of-N receiver can be easily compared with linear summation for small SNR, which corresponds to a large number N of available echoes. For zero binary quantization threshold ($y_b = 0$) and for small SNR

$$p_0 = 1/2, \tag{A26}$$

and

$$\begin{aligned}
p_1 &= \operatorname{erfc}_*[(y_b - A)/\sigma] = \int_{-A/\sigma}^{\infty}(2\pi)^{-1/2}\exp(-y^2/2)\,dy \\
&= (1/2) + \int_{0}^{A/\sigma}(2\pi)^{-1/2}\exp(-y^2/2)\,dy \\
&\approx p_0 + (A/\sigma)\,(d/dx)\left[\int_{0}^{x}(2\pi)^{-1/2}\exp(-y^2/2)\,dy\right]_{x=0} \\
&= p_0 + [A/(\sqrt{2\pi}\,\sigma)].
\end{aligned} \tag{A27}$$

Substituting (A26) and (A27) into (A19) yields

$$d_b \approx \frac{\sqrt{N}A/\sigma}{\sqrt{\pi\left[(1/2) - (A/\sqrt{2\pi}\,\sigma)^2\right]}} \approx (2/\pi)^{1/2}d_l. \tag{A28}$$

At low SNR (large N), $d_b \approx 0.8\,d_l$. The performance of the binary M-out-of-N receiver is slightly worse than that of the linear summation receiver for a large number of echoes, and a slightly larger SNR should be required for detection. This comparison pertains to additive zero-mean Gaussian noise with known variance (known expected noise power). For non-Gaussian and/or nonstationary noise, the binary M-out-of-N receiver may be superior to linear summation.
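
The low-SNR limit in (A28) can be checked numerically without the linearization in (A27). The Python sketch below evaluates (A19) with the exact Gaussian $p_1$ for a zero threshold and compares it with (A7). The function names and the sample amplitude ratios are illustrative assumptions.

```python
import math
from statistics import NormalDist

def binary_index(n, amplitude_ratio):
    """Eq. (A19) with zero threshold: p0 = 1/2 and p1 = erfc_*(-A/sigma)."""
    p0 = 0.5
    p1 = NormalDist().cdf(amplitude_ratio)
    return n * (p1 - p0) / math.sqrt(0.5 * n * (p1 * (1 - p1) + p0 * (1 - p0)))

def linear_index(n, amplitude_ratio):
    """Eq. (A7): d_l = sqrt(N) * A / sigma."""
    return math.sqrt(n) * amplitude_ratio

limit = math.sqrt(2.0 / math.pi)
for ratio in (1.0, 0.3, 0.1, 0.01):  # assumed A/sigma values
    print(ratio, binary_index(16, ratio) / linear_index(16, ratio), limit)
```

The ratio $d_b/d_l$ does not depend on N in this calculation and approaches $(2/\pi)^{1/2}$ as $A/\sigma$ becomes small, consistent with the factor of about 0.8 quoted above.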

For a small number of echoes, performance prediction of the binary M-out-of-N model involves a nonlinear transformation that greatly increases the required SNR. The binary summation data fit illustrated in Fig. 6 corresponds to $p_0 = 1/2$ and $c_b = 0.9999993$ in (A25). The parameter $c_b$ is written with high accuracy because a small change in $c_b$ is associated with a large change in $\operatorname{erfc}_*^{-1}(p_1)$ when $p_1$ is close to unity. Substituting $p_0 = 1/2$ and $c_b = 1.0$ into (A25) results in the equation

$$p_1 = (1/2) + [2(1 + N)]^{-1/2}. \tag{A29}$$

Substituting this expression for $p_1$ into (A24) with $p_0 = 1/2$ yields

$$[A(N)/\sigma]_{\text{bin}} = -\operatorname{erfc}_*^{-1}\{(1/2) + [2(1 + N)]^{-1/2}\}. \tag{A30}$$

The argument of the inverse complementary error function in (A30) is unity when N equals one and approaches 1/2 as N becomes very large. As indicated in Fig. 5, the inverse complementary error function is unbounded when its argument equals zero, decreases monotonically as its argument increases, passes through zero when the argument is 1/2, and goes to $-\infty$ when the argument equals 1,

$$\operatorname{erfc}_*^{-1}(0) = \infty; \quad \operatorname{erfc}_*^{-1}(1/2) = 0; \quad \operatorname{erfc}_*^{-1}(1) = -\infty. \tag{A31}$$

Since $c_b$ is slightly less than 1 for an MMSE fit to the data, the negative-inverse complementary error function is not unbounded for N = 1, but is very large. This extremely nonlinear behavior allows the binary M-out-of-N model to closely approximate the large SNR required by the dolphin when only one echo is available (N = 1).
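
The sensitivity to $c_b$ at N = 1 can be made concrete with a short Python sketch that evaluates (A24)-(A25) for $p_0 = 1/2$. The function name and the trial values of $c_b$ other than 0.9999993 are illustrative assumptions.

```python
import math
from statistics import NormalDist

def single_echo_amplitude(c_b):
    """[A(1)/sigma]_bin from Eqs. (A24)-(A25) with p0 = 1/2 and N = 1 (requires c_b < 1)."""
    k = c_b  # c_b / N with N = 1
    b = 1.0 + k  # 2*p0 + c_b/N with p0 = 1/2
    disc = b * b - 2.0 * (1.0 + k) * (0.5 * (1.0 + k) - k)
    p1 = (b + math.sqrt(disc)) / (2.0 * (1.0 + k))
    # erfc_*^{-1}(1/2) = 0, so the amplitude is -erfc_*^{-1}(p1) = -inv_cdf(1 - p1).
    return -NormalDist().inv_cdf(1.0 - p1)

for c_b in (0.9, 0.99, 0.999, 0.9999993):
    amplitude = single_echo_amplitude(c_b)
    print(c_b, round(amplitude, 3), round(20.0 * math.log10(amplitude), 2))
```

As $c_b$ approaches 1 the predicted single-echo amplitude keeps growing, which is why the fitted constant has to be reported to seven decimal places.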

Receiver comparisons can be further investigated by ROC (receiver operating characteristic) computation. The ROC is a plot of detection versus false-alarm probabilities for various threshold settings. For the binary M-out-of-N detector, threshold settings are limited to integer values of M between 1 and N. The probabilities of detection and false alarm for a given M value are

$$P_{D,M/N} = \sum_{k=M}^{N}\binom{N}{k}p_1^k(1 - p_1)^{N-k}, \quad p_1 = \operatorname{erfc}_*[(y_b - A)/\sigma], \tag{A32}$$

$$P_{F,M/N} = \sum_{k=M}^{N}\binom{N}{k}p_0^k(1 - p_0)^{N-k}, \quad p_0 = \operatorname{erfc}_*(y_b/\sigma). \tag{A33}$$
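
Equations (A32) and (A33) translate directly into code. The Python sketch below tabulates the ROC points of a binary M-out-of-N detector for Gaussian noise. The values N = 8 and $A/\sigma = 1$ and the function names are assumptions made for illustration.

```python
from math import comb
from statistics import NormalDist

def tail_probability(n, m, p):
    """Probability of at least m ones in n independent binary samples."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m, n + 1))

def crossing_probability(amplitude, sigma, threshold=0.0):
    """erfc_*[(threshold - amplitude)/sigma] for zero-mean Gaussian noise."""
    return 1.0 - NormalDist().cdf((threshold - amplitude) / sigma)

def roc_points(n, amplitude, sigma, threshold=0.0):
    """List of (M, P_F, P_D) for M = 1..N, following Eqs. (A33) and (A32)."""
    p1 = crossing_probability(amplitude, sigma, threshold)
    p0 = crossing_probability(0.0, sigma, threshold)
    return [(m, tail_probability(n, m, p0), tail_probability(n, m, p1))
            for m in range(1, n + 1)]

points = roc_points(n=8, amplitude=1.0, sigma=1.0)  # assumed example values
for m, p_false_alarm, p_detect in points:
    print(m, round(p_false_alarm, 4), round(p_detect, 4))
```

Each value of M gives one operating point, so the ROC of this detector is a set of N discrete points rather than a continuous curve.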

The false-alarm probability is independent of noise power $\sigma^2$ if the threshold for binary quantization $y_b$ equals zero. For zero $y_b$, the false-alarm rate of the binary M-out-of-N receiver can be changed by adjusting the number of binary threshold crossings M that are required for detection. As M is increased, the false-alarm rate decreases, and the detection probability also decreases.

For linear summation

$$P_{D,\text{lin}} = \operatorname{erfc}_*[\sqrt{N}(y - A)/\sigma], \tag{A34}$$

$$P_{F,\text{lin}} = \operatorname{erfc}_*(\sqrt{N}y/\sigma). \tag{A35}$$
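
The dependence of (A35) on noise power can be seen by evaluating it for a fixed threshold and two noise levels. In the Python sketch below, the threshold, noise levels, and N are assumed example values and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def erfc_star(x):
    """Integral of the standard normal density from x to infinity."""
    return 1.0 - NormalDist().cdf(x)

def linear_false_alarm(threshold, sigma, n):
    """Eq. (A35): false-alarm probability of the linear receiver."""
    return erfc_star(math.sqrt(n) * threshold / sigma)

def linear_detection(threshold, amplitude, sigma, n):
    """Eq. (A34): detection probability of the linear receiver."""
    return erfc_star(math.sqrt(n) * (threshold - amplitude) / sigma)

N_ECHOES = 4
for sigma in (1.0, 2.0):
    print("sigma =", sigma,
          "| P_F at y = 0.5:", round(linear_false_alarm(0.5, sigma, N_ECHOES), 4),
          "| P_F at y = 0:", linear_false_alarm(0.0, sigma, N_ECHOES))
```

With a nonzero fixed threshold the false-alarm probability changes when the noise level changes, while a zero threshold pins it at 0.5.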

Dependence of the false-alarm rate on noise power can again be eliminated by setting the threshold of the linear receiver equal to zero. In the linear case, however, this threshold setting yields a large false-alarm rate that cannot be changed without introducing noise dependence. For linear summation with a constant false-alarm rate that is unequal to 0.5, noise power must be estimated and the threshold value must be adjusted accordingly.